fix(security): datamarking substitutes zero-width formatting characters (closes #215) #231
Merged
…rs (closes #215)

PR #214 used `char::is_whitespace()` as the datamarking classifier. That predicate follows the Unicode `White_Space` property, which excludes zero-width formatting codepoints (ZWSP `U+200B`, ZWNJ `U+200C`, ZWJ `U+200D`, WJ `U+2060`, BOM `U+FEFF`). Those codepoints are documented prompt-injection vectors used to smuggle invisible instructions inside otherwise-benign Data zones, so they were passing through the transform unchanged.

Add `is_substitutable_whitespace(c)` = `c.is_whitespace()` plus the five zero-width codepoints, and use it in the substitution loop in place of the bare `char::is_whitespace` call.

Tests:

- `zwsp_is_not_substituted_by_design` renamed/inverted to `zwsp_is_substituted` (now asserts ZWSP IS replaced).
- New per-codepoint coverage: zwnj/zwj/word_joiner/bom.
- `mixed_whitespace_classes_all_substituted` left unchanged (still validates the Unicode `White_Space` set).
- `idempotence_apply_twice_equals_apply_once` extended with a mixed ordinary + zero-width input to exercise both classifier branches; second pass remains a no-op.
Summary
Closes #215. Follow-up to PR #214 (IS-060 PR-2 datamarking transform).
The gap
PR #214 used `char::is_whitespace()` as the datamarking classifier, per the PR-2 brief. Rust's `is_whitespace` follows the Unicode `White_Space` property, which excludes the zero-width / formatting codepoints commonly used as invisible prompt-injection vectors:

- `U+200B` (ZERO WIDTH SPACE)
- `U+200C` (ZWNJ)
- `U+200D` (ZWJ)
- `U+2060` (WORD JOINER)
- `U+FEFF` (BOM)

These passed through the Data-zone transform unchanged, allowing an attacker to smuggle invisible instructions inside an otherwise-marked Data zone.
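The gap can be reproduced with the standard library alone. The sketch below is illustrative and not taken from the PR; the constant and helper names are hypothetical. It collects the listed codepoints that `char::is_whitespace` fails to classify as whitespace.

```rust
/// The five zero-width / formatting codepoints named in this PR.
/// (Hypothetical constant, introduced only for this illustration.)
const ZERO_WIDTH: [char; 5] = ['\u{200B}', '\u{200C}', '\u{200D}', '\u{2060}', '\u{FEFF}'];

/// Returns the codepoints from `ZERO_WIDTH` that `char::is_whitespace`
/// does NOT treat as whitespace, i.e. the ones a classifier built only
/// on `is_whitespace` would let through.
fn missed_by_is_whitespace() -> Vec<char> {
    ZERO_WIDTH
        .iter()
        .copied()
        .filter(|c| !c.is_whitespace())
        .collect()
}
```

Calling `missed_by_is_whitespace()` returns all five codepoints, whereas a character such as NBSP (`'\u{00A0}'`) is already covered by `is_whitespace`.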
The fix
Introduce a small predicate, `is_substitutable_whitespace`, that augments `char::is_whitespace` with the five zero-width codepoints, and route the substitution loop through it.

This is Option 1 from issue #215, the minimal-risk path. The Unicode whitespace surface is unchanged; we only widen the predicate by exactly the five zero-width codepoints called out as attack vectors. No public type or marker-selection logic changes.
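A minimal sketch of what such a predicate and loop could look like follows. Only the name `is_substitutable_whitespace` and its definition (`c.is_whitespace()` plus the five codepoints) come from the PR text. The `datamark` function, the fixed `MARKER` constant, and the one-marker-per-character behaviour are assumptions made for illustration; the actual transform selects its marker elsewhere.

```rust
/// Assumed fixed PUA marker for this sketch only; the real transform
/// has its own marker-selection logic, which this PR does not change.
const MARKER: char = '\u{E000}';

/// Unicode `White_Space` plus the five zero-width formatting codepoints.
fn is_substitutable_whitespace(c: char) -> bool {
    c.is_whitespace()
        || matches!(
            c,
            '\u{200B}' | '\u{200C}' | '\u{200D}' | '\u{2060}' | '\u{FEFF}'
        )
}

/// Hypothetical substitution loop: replaces each substitutable character
/// with `MARKER` and reports the change in UTF-8 byte length.
fn datamark(input: &str) -> (String, isize) {
    let out: String = input
        .chars()
        .map(|c| if is_substitutable_whitespace(c) { MARKER } else { c })
        .collect();
    let delta = out.len() as isize - input.len() as isize;
    (out, delta)
}
```

Because `MARKER` itself is neither `White_Space` nor one of the five codepoints, running `datamark` on its own output changes nothing, which is the idempotence property the tests rely on.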
Threat-model rationale
The Spotlighting datamarking guarantee is "any whitespace-equivalent gap in a Data zone is replaced by an out-of-band PUA marker so the model can distinguish data from instructions." If invisible-character smuggling can introduce gaps the predicate does not see, the guarantee is broken — the attacker can structure invisible-character "words" the model still parses as instructions. Treating zero-width formatting codepoints as substitutable whitespace closes that hole without expanding the marker contract.
Testing
- `zwsp_is_not_substituted_by_design` renamed/inverted to `zwsp_is_substituted`; now asserts ZWSP IS replaced (and validates the `byte_delta`: ZWSP and U+E000 are both 3 bytes in UTF-8 → delta 0).
- New per-codepoint coverage: `zwnj_is_substituted`, `zwj_is_substituted`, `word_joiner_is_substituted`, `bom_is_substituted`.
- `mixed_whitespace_classes_all_substituted` retained; the existing Unicode `White_Space` set (space, tab, newline, NBSP, VT, FF) still substitutes unchanged.
- `idempotence_apply_twice_equals_apply_once` extended with a mixed input containing both ordinary whitespace and zero-width codepoints (`"hello world\nfoo\u{200B}bar\u{FEFF}baz"`); second pass remains a no-op with zero `byte_delta`.

Verification

- `cargo fmt --all --check` clean
- `cargo clippy --workspace -- -D warnings` clean
- `cargo test -p llmtrace-security` 595/595 pass

Note: a workspace-wide `cargo test --workspace` flagged `llmtrace-proxy::tests::test_debug_verdicts_returns_404_when_flag_off` as failing under parallel contention (502 vs expected 404), but the same test passes in isolation. That test does not touch datamarking and the failure reproduces on `main` under the same parallel conditions, so it is unrelated to this change.

Test plan

- `fix/is-060-datamarking-zwsp`
- `spotlighting_applied` with non-zero `byte_delta` on the affected Data zones
- `llmtrace_spotlighting_marker_collision_total` remains flat (the predicate widening does not change marker-selection logic)

Closes #215.
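The byte-delta arithmetic cited in the Testing section (ZWSP and U+E000 are both 3 bytes in UTF-8) can be checked with `char::len_utf8`. The helper below is a hypothetical illustration, not code from the PR, and assumes a single character is replaced by a single marker character.

```rust
/// Change in UTF-8 byte length when one `original` character is replaced
/// by one `marker` character. (Hypothetical helper for illustration.)
fn byte_delta(original: char, marker: char) -> isize {
    marker.len_utf8() as isize - original.len_utf8() as isize
}
```

With U+E000 as the marker, replacing ZWSP gives a delta of 0, replacing an ASCII space gives +2, and replacing NBSP (2 bytes) gives +1. This is why a non-zero `byte_delta` indicates that ordinary whitespace was substituted, while a zone containing only three-byte zero-width codepoints can be fully substituted with a delta of 0.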